An Innovative Algorithm for Feature Selection Based on Rough Set with Fuzzy C-means Clustering
Authors
Abstract
Feature selection is a fundamental problem in data mining, especially for high-dimensional datasets. It is a process commonly used in machine learning, in which a subset of the original features is selected before a learning algorithm is applied. The best subset contains the minimum number of dimensions while retaining suitably high classifier accuracy relative to the original feature set. The objective of the proposed approach is to reduce the number of input features and thereby identify the key features for breast cancer diagnosis, using fuzzy c-means clustering (FCM), K-nearest neighbors (KNN) and rough sets. The results show that the hybrid method compares favorably with the full-input model in both computational complexity and classification accuracy, yielding more accurate diagnosis and prognosis results.
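The abstract does not spell out how the rough-set step selects features, so the following is only a minimal sketch of one common rough-set approach, the greedy QuickReduct procedure based on the dependency degree. It is not presented as the authors' method. The feature names (`size`, `shape`, `texture`), the toy records and the function names are all hypothetical, and the feature values are assumed to be already discretized (for example, by replacing each value with a cluster label).

```python
from collections import defaultdict

def dependency(rows, attrs, decision):
    """Rough-set dependency degree of `decision` on `attrs`.

    Rows that agree on every attribute in `attrs` form an equivalence
    class. A class belongs to the positive region when all of its rows
    share the same decision value. The result is the fraction of rows
    that fall in the positive region.
    """
    blocks = defaultdict(list)
    for row in rows:
        blocks[tuple(row[a] for a in attrs)].append(row[decision])
    positive = sum(len(v) for v in blocks.values() if len(set(v)) == 1)
    return positive / len(rows)

def quick_reduct(rows, attrs, decision):
    """Greedy QuickReduct: add the attribute that raises the dependency
    degree the most until it matches that of the full attribute set."""
    target = dependency(rows, attrs, decision)
    reduct = []
    while dependency(rows, reduct, decision) < target:
        candidates = [a for a in attrs if a not in reduct]
        best = max(candidates,
                   key=lambda a: dependency(rows, reduct + [a], decision))
        reduct.append(best)
    return reduct

# Hypothetical, already-discretized toy records (not data from the paper).
ROWS = [
    {"size": 0, "shape": 0, "texture": 0, "diagnosis": "benign"},
    {"size": 0, "shape": 1, "texture": 1, "diagnosis": "benign"},
    {"size": 1, "shape": 0, "texture": 0, "diagnosis": "malignant"},
    {"size": 1, "shape": 1, "texture": 1, "diagnosis": "malignant"},
    {"size": 0, "shape": 1, "texture": 0, "diagnosis": "benign"},
    {"size": 1, "shape": 0, "texture": 1, "diagnosis": "malignant"},
]

if __name__ == "__main__":
    print(quick_reduct(ROWS, ["size", "shape", "texture"], "diagnosis"))
    # prints ['size']
```

In this toy table the single feature `size` already separates the two classes, so the greedy search stops after one step. In a fuller pipeline of the kind the abstract describes, a classifier such as KNN would then be evaluated on the reduced feature set and compared against the full-input model.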
Similar resources
A hybrid filter-based feature selection method via hesitant fuzzy and rough sets concepts
High-dimensional microarray datasets are difficult to classify since they have many features with a small number of instances and an imbalanced distribution of classes. This paper proposes a filter-based feature selection method to improve the classification performance of microarray datasets by selecting the significant features. Combining the concepts of rough sets, weighted rough set, fuzzy rough se...
A Hybrid Time Series Clustering Method Based on Fuzzy C-Means Algorithm: An Agreement Based Clustering Approach
In recent years, the advancement of information gathering technologies such as GPS and GSM networks has led to huge, complex datasets such as time series and trajectories. As a result, it is essential to use appropriate methods to analyze the large raw datasets produced. Extracting useful information from large datasets has always been one of the most important challenges in different sciences,...
A Fuzzy C-means Algorithm for Clustering Fuzzy Data and Its Application in Clustering Incomplete Data
The fuzzy c-means clustering algorithm is a useful tool for clustering, but it is suitable only for crisp, complete data. In this article, an enhancement of the algorithm is proposed which is suitable for clustering trapezoidal fuzzy data. A linear ranking function is used to define a distance for trapezoidal fuzzy data. Then, as an application, a method based on the proposed algorithm is pres...
Enforcement of rough fuzzy clustering based on correlation analysis
Clustering is a standard approach to analyzing data and constructing separated groups of similar items. The most widely used robust soft clustering methods are fuzzy, rough and rough fuzzy clustering. The prominent features of soft clustering motivate combining rough and fuzzy sets. The Rough Fuzzy C-Means (RFCM) includes the lower and boundary estimation of rough sets, and fuzzy membership of fuz...
Bilateral Weighted Fuzzy C-Means Clustering
Nowadays, the Fuzzy C-Means method has become one of the most popular clustering methods based on minimization of a criterion function. However, the performance of this clustering algorithm may be significantly degraded in the presence of noise. This paper presents a robust clustering algorithm called Bilateral Weighted Fuzzy C-Means (BWFCM). We used a new objective function that uses some k...